17 research outputs found

    A Direct Translation from XPath to Nondeterministic Automata

    Abstract. Since navigational aspects of XPath correspond to first-order definability, it has been proposed to use the analogy with the very successful technique of translating LTL into automata, and produce efficient translations of XPath queries into automata on unranked trees. These translations can then be used for a variety of reasoning tasks, such as XPath consistency or optimization, under XML schema constraints. In the verification scenarios, translations into both nondeterministic and alternating automata are used. But while a direct translation from XPath into alternating automata is known, only an indirect translation into nondeterministic automata, going via intermediate logics, exists. A direct translation is desirable, as most XML specifications have particularly nice translations into nondeterministic automata and it is natural to use such automata to reason about XPath and schemas. The goal of the paper is to produce such a direct translation of XPath into nondeterministic automata.

    Run-Based Semantics for RPQs

    The formalism of RPQs (regular path queries) is an important building block of most query languages for graph databases. RPQs are generally evaluated under homomorphism semantics; in particular, only the endpoints of the matched walks are returned. Practical applications often need the full matched walks to compute aggregate values. In those cases, homomorphism semantics are not suitable, since the number of matched walks can be infinite. Hence, graph-database engines adapt the semantics of RPQs, often neglecting theoretical red flags. For instance, the popular query language Cypher uses trail semantics, which guarantees a finite result at the cost of making computational problems intractable. We propose a new kind of semantics for RPQs, including in particular simple-run and binding-trail semantics, as a candidate to reconcile theoretical considerations with practical aspirations. Both guarantee a finite output in a way that is compatible with homomorphism semantics: projection on endpoints coincides with homomorphism semantics. Hence, testing the emptiness of the result is tractable, and known methods readily apply. Moreover, simple-run and binding-trail semantics support bag semantics, and enumeration of the bag of results is tractable.
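The contrast drawn in this abstract between matched walks, trails, and endpoint pairs can be illustrated with a small sketch. It is not the paper's simple-run or binding-trail semantics; it only shows, on a hypothetical two-node cycle and the query `a+`, that the number of matching walks grows with the length bound while the trails and the endpoint pairs stay finite. The edge encoding and the function names are assumptions made for illustration.

```python
def walks_up_to(edges, label, max_len):
    """Walks of length 1..max_len that use only `label` edges.

    Each walk is a tuple of edge indices into `edges`,
    where an edge is a (source, label, target) triple.
    """
    walks = [(i,) for i, (_, a, _) in enumerate(edges) if a == label]
    frontier = list(walks)
    for _ in range(max_len - 1):
        extended = []
        for walk in frontier:
            end = edges[walk[-1]][2]
            for i, (u, a, _) in enumerate(edges):
                if a == label and u == end:
                    extended.append(walk + (i,))
        walks.extend(extended)
        frontier = extended
    return walks


def is_trail(walk):
    """A trail is a walk that repeats no edge."""
    return len(set(walk)) == len(walk)


def endpoints(edges, walk):
    """Source of the first edge and target of the last edge."""
    return (edges[walk[0]][0], edges[walk[-1]][2])


# Hypothetical graph: a cycle 1 -a-> 2 -a-> 1.
edges = [(1, "a", 2), (2, "a", 1)]

walks_5 = walks_up_to(edges, "a", 5)
walks_10 = walks_up_to(edges, "a", 10)
trails = [w for w in walks_10 if is_trail(w)]
pairs = {endpoints(edges, w) for w in walks_10}

print(len(walks_5), len(walks_10), len(trails), len(pairs))
```

Raising the bound keeps adding walks, which is why returning all matched walks is not an option on cyclic graphs, whereas the set of trails and the set of endpoint pairs stop growing.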

    Schema Mappings for Data Graphs


    Asymptotic Determinacy of Path Queries using Union-of-Paths Views

    We consider the view determinacy problem over graph databases for queries defined as (possibly infinite) unions of path queries. These queries select pairs of nodes in a graph that are connected through a path whose length falls in a given set. A view specification is a set of such queries. We say that a view specification V determines a query Q if, for all databases D, the answers to V on D contain enough information to answer Q. Our main result states that, given a view V, there exists an explicit bound that depends on V such that we can decide the determinacy problem for all queries that ask for a path longer than this bound, and provide first-order rewritings for the queries that are determined. We call this notion asymptotic determinacy. As a corollary, we can also compute the set of almost all path queries that are determined by V.
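As a minimal sketch of the notions defined in this abstract, the code below evaluates a union-of-paths query for a finite set of lengths and checks one easy instance of determinacy: a view returning paths of length 2 determines the query asking for paths of length 4, because the query can be rewritten as the composition of the view with itself. This is not the paper's decision procedure; the graph, the function names, and the restriction to finite length sets are assumptions made for illustration.

```python
def compose(r, s):
    """Relational composition: (x, z) such that (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def paths_of_length(edges, k):
    """Pairs joined by a directed path of exactly k edges (k >= 1)."""
    result = set(edges)
    for _ in range(k - 1):
        result = compose(result, edges)
    return result


def eval_union_of_paths(edges, lengths):
    """Pairs joined by a path whose length belongs to the finite set `lengths`."""
    answers = set()
    for k in lengths:
        answers |= paths_of_length(edges, k)
    return answers


# Hypothetical unlabeled directed graph.
edges = {(1, 2), (2, 3), (3, 4), (4, 5), (2, 5)}

view_answers = eval_union_of_paths(edges, {2})   # what the view exposes
query_answers = eval_union_of_paths(edges, {4})  # computed on the database
rewriting = compose(view_answers, view_answers)  # computed from the view only

print(sorted(view_answers), sorted(query_answers), sorted(rewriting))
```

The rewriting uses only the view answers, never the edges themselves, which is the point of determinacy: the query result is recoverable without access to the underlying database.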

    Consistency of injective tree patterns

    Testing if an incomplete description of an XML document is consistent, that is, if it describes a real document conforming to the imposed schema, amounts to deciding if a given tree pattern can be matched injectively into a tree accepted by a fixed automaton. This problem can be solved in polynomial time for patterns that use the child relation and the sibling order, but do not use the descendant relation. For general patterns the problem is in NP, but no lower bound has been known so far. We show that the problem is NP-complete already for patterns using only child and descendant relations. The source of hardness turns out to be the interplay between these relations: for patterns using only descendant we give a polynomial algorithm. We also show that the algorithm can be adapted to patterns using descendant and following-sibling, but combining descendant and next-sibling leads to intractability.

    Datalog Rewritings of Regular Path Queries using Views

    We consider query answering using views on graph databases, i.e., databases structured as edge-labeled graphs. We mainly consider views and queries specified by Regular Path Queries (RPQs). These are queries selecting pairs of nodes in a graph database that are connected via a path whose sequence of edge labels belongs to some regular language. We say that a view V determines a query Q if, for all graph databases D, the view image V(D) always contains enough information to answer Q on D. In other words, there is a well-defined function from V(D) to Q(D). Our main result shows that when this function is monotone, there exists a rewriting of Q as a Datalog query over the view instance V(D). In particular, the rewriting query can be evaluated in time polynomial in the size of V(D). Moreover, this implies that it is decidable whether an RPQ can be rewritten in Datalog using RPQ views.
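The definition of a Regular Path Query given in this abstract can be made concrete with a short sketch of the standard evaluation technique: a breadth-first search over the product of the graph and an automaton for the regular language. This is a generic illustration, not the Datalog rewriting of the paper; the triple encoding of edges, the dictionary encoding of the automaton, and the name `eval_rpq` are assumptions.

```python
from collections import deque


def eval_rpq(edges, nfa_delta, nfa_start, nfa_finals):
    """Pairs (source, target) joined by a path whose label word the NFA accepts.

    edges: iterable of (u, label, v) triples.
    nfa_delta: dict mapping (state, label) to a set of successor states.
    """
    out = {}
    nodes = set()
    for u, a, v in edges:
        out.setdefault(u, []).append((a, v))
        nodes.update((u, v))

    answers = set()
    for src in nodes:
        # Explore (graph node, automaton state) pairs reachable from src.
        seen = {(src, nfa_start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in nfa_finals:
                answers.add((src, node))
            for label, nxt in out.get(node, []):
                for state2 in nfa_delta.get((state, label), ()):
                    if (nxt, state2) not in seen:
                        seen.add((nxt, state2))
                        queue.append((nxt, state2))
    return answers


# Hypothetical graph: 1 -a-> 2 -b-> 3, plus a b-loop on node 3.
edges = [(1, "a", 2), (2, "b", 3), (3, "b", 3)]
# NFA for the language a b*: state 0 -a-> 1, state 1 -b-> 1, state 1 final.
delta = {(0, "a"): {1}, (1, "b"): {1}}

result = eval_rpq(edges, delta, 0, {1})
print(sorted(result))
```

Because each (node, state) pair is visited at most once per source, the search terminates even on cyclic graphs, which matches the endpoint-only reading of RPQs described above.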

    A Researcher’s Digest of GQL

    GQL (Graph Query Language) is being developed as a new ISO standard for graph query languages, to play the same role for graph databases as SQL plays for relational databases. In parallel, an extension of SQL for querying property graphs, SQL/PGQ, is being added to the SQL standard; it shares the graph pattern matching functionality with GQL. Both standards (not yet published) are hard-to-understand specifications of hundreds of pages. The goal of this paper is to present a digest of the language that is easy for the research community to understand, and thus to initiate research on these future standards for querying graphs. The paper concentrates on pattern matching features shared by GQL and SQL/PGQ, as well as querying facilities of GQL.

    Graph Pattern Matching in GQL and SQL/PGQ

    As graph databases become widespread, JTC1, the committee in joint charge of information technology standards for the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), has approved a project to create GQL, a standard property graph query language. This complements a project to extend SQL with a new part, SQL/PGQ, which specifies how to define graph views over an SQL tabular schema, and to run read-only queries against them. Both projects have been assigned to the ISO/IEC JTC1 SC32 working group for Database Languages, WG3, which continues to maintain and enhance SQL as a whole. This common responsibility helps enforce a policy that the identical core of both PGQ and GQL is a graph pattern matching sub-language, here termed GPML. The WG3 design process is also analyzed by an academic working group, part of the Linked Data Benchmark Council (LDBC), whose task is to produce a formal semantics of these graph data languages, which complements their standard specifications. This paper, written by members of WG3 and LDBC, presents the key elements of the GPML of SQL/PGQ and GQL in advance of the publication of these new standards.

    View-based query determinacy and rewritings over graph databases

    Graph databases appear naturally in various scenarios, such as social networks and the semantic Web. In these cases, the information contained in the database lies as much in the data itself as in the topology of the graph, that is, in how the data points are linked together. This leads to considering traditional database theory questions for query languages that return data nodes based on the paths of the graph connecting them. We focus our attention on the view-based query determinacy and rewriting problems. They ask whether a view of the database contains enough information to fully answer a query without accessing the database directly. If so, we then want to express the answer to the query explicitly in terms of the view. This setting occurs in many applications, such as data integration and query optimization. We start by comparing these two tasks to other common tasks in this setting: computing certain answers, checking consistency of a view instance, and updating it. We then build on these results in two specific cases. First, we show that for regular path queries, the existence of a monotone rewriting coincides with the existence of a rewriting expressible in Datalog. Then, we show that for views that only consider the lengths of the paths in the graph, we can decide a weaker form of determinacy, called asymptotic determinacy, and produce first-order rewritings for the queries that are asymptotically determined.